Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Detection
Efforts to automate the reconstruction of neural circuits from 3D electron
microscopic (EM) brain images are critical for the field of connectomics. An
important computation for reconstruction is the detection of neuronal
boundaries. Images acquired by serial section EM, a leading 3D EM technique,
are highly anisotropic, with inferior quality along the third dimension. For
such images, the 2D max-pooling convolutional network has set the standard for
performance at boundary detection. Here we achieve a substantial gain in
accuracy through three innovations. First, following the trend towards deeper networks
for object recognition, we use a much deeper network than previously employed
for boundary detection. Second, we incorporate 3D as well as 2D filters, to
enable computations that use 3D context. Finally, we adopt a recursively
trained architecture in which a first network generates a preliminary boundary
map that is provided as input along with the original image to a second network
that generates a final boundary map. Backpropagation training is accelerated by
ZNN, a new implementation of 3D convolutional networks that uses multicore CPU
parallelism for speed. Our hybrid 2D-3D architecture could be more generally
applicable to other types of anisotropic 3D images, including video, and our
recursive framework to any image labeling problem.
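The two-stage recursive scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's actual networks: `net1` and `net2` stand in for the trained convolutional networks, and channel-first arrays are assumed.

```python
import numpy as np

def recursive_boundary_detection(image, net1, net2):
    """Two-stage recursive inference (sketch): net1 produces a preliminary
    boundary map, which is channel-stacked with the raw image and fed to
    net2, which produces the final boundary map."""
    prelim = net1(image)                               # preliminary boundary map
    stacked = np.concatenate([image, prelim], axis=0)  # image + map as channels
    return net2(stacked)
```

In the paper, both stages are trained with backpropagation, the second taking the first's output as an extra input channel.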
PZnet: Efficient 3D ConvNet Inference on Manycore CPUs
Convolutional nets have been shown to achieve state-of-the-art accuracy in
many biomedical image analysis tasks. Many tasks in the biomedical analysis
domain involve analyzing volumetric (3D) data acquired by CT, MRI, and
microscopy. To deploy convolutional nets in practical
working systems, it is important to solve the efficient inference problem.
Namely, one should be able to apply an already-trained convolutional network to
many large images using limited computational resources. In this paper we
present PZnet, a CPU-only engine that can be used to perform inference for a
variety of 3D convolutional net architectures. PZnet outperforms MKL-based CPU
implementations of PyTorch and TensorFlow by more than 3.5x for the popular
U-Net architecture. Moreover, for 3D convolutions with low feature-map counts,
cloud CPU inference with PZnet outperforms cloud GPU inference in terms of cost
efficiency.
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Advanced compiler technology is crucial for enabling machine learning
applications to run on novel hardware, but traditional compilers fail to
deliver performance, popular auto-tuners have long search times and
expert-optimized libraries introduce unsustainable costs. To address this, we
developed LoopTune, a deep reinforcement learning compiler that optimizes
tensor computations in deep learning models for the CPU. LoopTune optimizes
tensor traversal order while using the ultra-fast lightweight code generator
LoopNest to perform hardware-specific optimizations. With a novel graph-based
representation and action space, LoopTune speeds up LoopNest by 3.2x,
generating an order of magnitude faster code than TVM, 2.8x faster than
MetaSchedule, and 1.08x faster than AutoTVM, consistently performing at the
level of the hand-tuned library NumPy. Moreover, LoopTune tunes code in a
matter of seconds.
Automated computation of arbor densities: a step toward identifying neuronal cell types
The shape and position of a neuron convey information regarding its molecular and functional identity. The identification of cell types from structure, a classic method, relies on the time-consuming step of arbor tracing. However, as genetic tools and imaging methods make data-driven approaches to neuronal circuit analysis feasible, the need for automated processing increases. Here, we first establish that mouse retinal ganglion cell types can be as precise about distributing their arbor volumes across the inner plexiform layer as they are about distributing the skeletons of the arbors. Based on this observation, we then describe an automated approach to computing the spatial distribution of the dendritic arbors, or arbor density, with respect to a global depth coordinate. Our method involves three-dimensional reconstruction of neuronal arbors by a supervised machine learning algorithm, post-processing of the enhanced stacks to remove somata and isolate the neuron of interest, and registration of neurons to each other using automatically detected arbors of the starburst amacrine interneurons as fiducial markers. In principle, this method could generalize to other structures of the CNS, provided that they allow sparse labeling of the cells and contain a reliable axis of spatial reference.
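The arbor-density statistic at the heart of this abstract, the fraction of a neuron's arbor volume falling at each depth, can be sketched in a few lines. This is an illustrative computation, not the paper's pipeline: `arbor_mask` is a hypothetical boolean volume marking the voxels of one reconstructed arbor, and the depth axis stands in for the global depth coordinate obtained after registration.

```python
import numpy as np

def arbor_density_profile(arbor_mask, depth_axis=0, n_bins=10):
    """Depth profile of an arbor (sketch): fraction of the arbor's volume
    falling in each bin of a global depth coordinate."""
    depths = np.nonzero(arbor_mask)[depth_axis]  # depth index of each arbor voxel
    hist, _ = np.histogram(depths, bins=n_bins,
                           range=(0, arbor_mask.shape[depth_axis]))
    return hist / hist.sum()                     # normalize to a distribution
```

Comparing such profiles across neurons, after registering them to a common depth coordinate, is what allows arbor densities to serve as a cell-type signature.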
Efficient watershed algorithm implementation for large affinity graphs
Thesis (M.Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 43). In this thesis, I designed and implemented an efficient, parallel, generalized watershed algorithm for hierarchical segmentation of affinity graphs. By introducing four variable parameters, the algorithm enables us to use prior knowledge about the input graph in order to achieve better results. The algorithm is well suited to hierarchical segmentation of large-scale 3D images of brain tissue obtained by electron microscopy, making it an essential tool for reconstructing the brain's neural networks, called connectomes. The algorithm was fully implemented in C++ and tested on the largest affinity graph currently available, of size 90 GB, to which no existing watershed implementation could be applied. By Aleksandar Zlateski, M.Eng.
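The core merging step of a watershed on an affinity graph can be illustrated with a union-find sketch. This is a simplified single-threshold variant for illustration only; the thesis algorithm is hierarchical and exposes four tunable parameters, which are omitted here.

```python
class UnionFind:
    """Disjoint-set forest with path halving."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def watershed_segments(n_nodes, edges, threshold):
    """Simplified affinity-graph clustering (sketch): process edges in order
    of decreasing affinity, merging the endpoints of every edge whose
    affinity is at least `threshold`. Edges are (affinity, u, v) tuples."""
    uf = UnionFind(n_nodes)
    for affinity, u, v in sorted(edges, reverse=True):
        if affinity >= threshold:
            uf.union(u, v)
    return [uf.find(i) for i in range(n_nodes)]  # one label per voxel/node
```

Because each voxel is a node and each nearest-neighbor affinity an edge, a near-linear-time union-find pass like this is what makes the approach viable on graphs far too large for classical watershed implementations.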
Scalable algorithms for semi-automatic segmentation of electron microscopy images of the brain tissue
Thesis (Ph.D.), Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (pages 139-145). I present a set of fast and scalable algorithms for segmenting very large 3D images of brain tissue. Light and electron microscopy can now produce terascale 3D images within hours. Extracting information about the shapes and connectivity of neurons requires fast and accurate image segmentation algorithms. Due to the sheer size of the problem, traditional approaches can be computationally infeasible. I focus on a segmentation pipeline that breaks the segmentation problem into multiple stages, each of which can be improved independently. In the first stage of the pipeline, convolutional neural networks are used to predict segment boundaries. A watershed transform is then used to obtain an over-segmentation, which is then reduced using agglomerative clustering algorithms. Finally, manual or computer-assisted proofreading is done by experts. In this thesis, I revisit the traditional approaches for training and applying convolutional neural networks, and propose:
- A fast and scalable 3D convolutional network training algorithm suited for multi-core and many-core shared-memory machines. The two main features of the algorithm are (1) minimizing the required computation by using FFT-based convolution with memoization, and (2) a parallelization approach that can utilize a large number of CPUs while minimizing any required synchronization.
- A high-throughput inference algorithm that can utilize all available computational resources, both CPUs and GPUs. I introduce a set of highly parallel algorithms for different layer types and architectures, and show how to combine them to achieve very high throughput.
Additionally, I study the theoretical properties of the watershed transform of edge-weighted graphs and propose a linear-time algorithm. I propose a set of modifications to the standard algorithm, and a quasi-linear agglomerative clustering algorithm that can greatly reduce the over-segmentation produced by the standard watershed algorithm. By Aleksandar Zlateski, Ph.D.
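The FFT-based convolution with memoization mentioned in the first bullet can be sketched as follows. This is an illustrative single-threaded NumPy version, not the thesis implementation: the kernel's Fourier transform is cached so that repeated convolutions with the same filter, as happens when the same filters are applied across many same-sized inputs during training, reuse it.

```python
import numpy as np

_KERNEL_FFT_CACHE = {}  # (kernel bytes, kernel shape, padded shape) -> transform

def fft_conv_valid(image, kernel):
    """'Valid' n-dimensional convolution via FFT (sketch), memoizing the
    kernel's transform across calls with the same kernel and input size."""
    full = tuple(int(i + k - 1) for i, k in zip(image.shape, kernel.shape))
    key = (kernel.tobytes(), kernel.shape, full)
    if key not in _KERNEL_FFT_CACHE:
        _KERNEL_FFT_CACHE[key] = np.fft.rfftn(kernel, full)
    out = np.fft.irfftn(np.fft.rfftn(image, full) * _KERNEL_FFT_CACHE[key], full)
    # keep only the fully-overlapping ('valid') region
    return out[tuple(slice(k - 1, i) for i, k in zip(image.shape, kernel.shape))]
```

Padding both operands to the full linear-convolution size makes the pointwise product in frequency space equivalent to spatial convolution, and caching the kernel transform removes one of the two forward FFTs from every subsequent call.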